
Our analysis uses the data from Dell and Querubin (2018), which are available in the supplementary material of their paper.

## The role of each directory

- `OriginalData`: This folder holds a copy of the data from the supplementary material of Dell and Querubin (2018). Users should first copy the replication data into this folder.
- `Database`: This folder contains all the data used in policy learning, including the preprocessed data and the MCMC samples.
  - `2way3way.db`: A database containing the 2-way and 3-way decision tables used in the Hamlet Evaluation System (HES). Its contents are the same as `hes70_dectab_2way.csv` and `hes70_dectab_3way.csv` in the supplementary material of Dell and Querubin (2018).
    - Users should then run `Construct_policy_table.py` to produce `table2_array.npy` and `table3_array.npy`, which store the 2-way and 3-way decision tables as NumPy arrays. These files simplify the code in the later analysis.
  - Users should then run `Produce_hes_bomb_lca.py`, which produces the database `alldata.db`. It has three tables, `HES`, `bomb_1`, and `lca_1`, which hold the HES data, the airstrike data, and the outcomes for policy learning, respectively.
  - Finally, users should run `Preprocessing.py`, which produces the database `analysisdata.db`. This database contains the data ready for policy learning. It has a table `Data1969_09`, which contains the HES submodel scores and intermediate scores measured in September 1969, as well as the outcomes measured in January 1970.
- `MCMC`: This folder contains the code for drawing posterior samples of the Gaussian Process.
  - Users should run `gibbs.py` to sample from the posterior of the Gaussian Process. The samples will be saved in the folder `Database/MCMCSample`.
- `PolicyLearning`: This folder contains the code for learning the policy (decision tables). 
  - `SingleLevel.py`: This file contains the code for learning a one-level aggregation of the HES decision table. It will produce a figure `fig1` in the folder `PolicyLearning/fig1`.
  - `ThreeLevel.py`: This file contains the code for learning the whole 3-level aggregation of the HES decision table. It will produce a figure `PDP` in the folder `Visualization/PDP`.
  - `Utility.py`: This file contains the utility functions and classes used by `SingleLevel.py` and `ThreeLevel.py`, including the optimization routine based on topological sorts that is used to learn the policy.
- `Visualization`: This folder contains code for generating other illustrative and descriptive plots used in the paper.
  - `HES_structure`: This folder contains code for plotting the structure of the HES hierarchical aggregation.
  - `DAG`: This folder contains code for plotting the Directed Acyclic Graph (DAG) representing the 2-way and 3-way decision tables.
  - `DescriptiveAnalysis.py`: This file contains some descriptive statistics for the data.
  - `PDP`: This folder contains the plot of partial dependence importance for the learned policy.
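
After running the data preparation scripts described under `Database`, it can be useful to confirm that the expected tables exist before starting the MCMC or policy learning steps. The following is a minimal sketch of such a check, not part of the repository. It assumes that the `.db` files are SQLite databases located directly in `Database/`; the helper names `list_tables` and `missing_tables` are hypothetical.

```python
import sqlite3
from pathlib import Path

# Assumption: the .db files are SQLite databases stored directly in Database/.
DATABASE_DIR = Path("Database")

# Database file -> tables this README says it should contain.
EXPECTED_TABLES = {
    "alldata.db": {"HES", "bomb_1", "lca_1"},
    "analysisdata.db": {"Data1969_09"},
}


def list_tables(db_path):
    """Return the set of table names in a SQLite file, opened read-only."""
    # mode=ro avoids silently creating an empty database if the path is wrong.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return {name for (name,) in rows}


def missing_tables(db_path, expected):
    """Return the expected table names that are absent from the database."""
    return set(expected) - list_tables(db_path)


if __name__ == "__main__":
    for filename, expected in EXPECTED_TABLES.items():
        path = DATABASE_DIR / filename
        if not path.exists():
            print(f"{path}: not found (has the corresponding script been run?)")
            continue
        missing = missing_tables(path, expected)
        if missing:
            print(f"{path}: missing tables {sorted(missing)}")
        else:
            print(f"{path}: OK")
```

Run from the repository root, the script prints one status line per database. If the databases are stored elsewhere or in another format, `DATABASE_DIR` and the connection code would need to be adapted.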


## References

Melissa Dell, Pablo Querubin, Nation Building Through Foreign Intervention: Evidence from Discontinuities in Military Strategies, The Quarterly Journal of Economics, Volume 133, Issue 2, May 2018, Pages 701–764, https://doi.org/10.1093/qje/qjx037